Evaluating Statistical Machine Translation from English to Dutch
Abstract
In this paper, I evaluate how effectively statistical machine translation translates English text into Dutch, using both empirical evaluation and the BLEU evaluation metric. I also give a brief overview of the theory behind statistical machine translation and automated translation evaluation metrics. I translated a sample of the English proceedings of the European Parliament into Dutch and compared the output to reference translations from the Dutch proceedings. I also compared the output to translations of the same texts produced by the Babelfish translation service. I conclude that statistical machine translation can provide a rough translation from English to Dutch, but that it needs help from other translation techniques to produce well-formed sentences.
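As an illustration of the kind of comparison the abstract describes, the following is a minimal sketch of a sentence-level BLEU score: the geometric mean of clipped n-gram precisions (up to 4-grams) multiplied by a brevity penalty. It is not the paper's own implementation. The function names `ngrams` and `bleu`, the whitespace tokenization, the lack of smoothing, and the Dutch example sentences are all assumptions made for this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU without smoothing (hypothetical helper).

    candidate: list of tokens; references: list of token lists.
    Returns 0.0 as soon as any n-gram precision is zero.
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Made-up example sentences, tokenized on whitespace (an assumption of this sketch).
reference = "de kat zit op de mat".split()
candidate = "de kat zit op mat".split()
score = bleu(candidate, [reference])
```

A candidate identical to the reference scores 1.0, while the shortened candidate above is penalized both for its missing n-grams and for its length. Published BLEU scores are normally computed over a whole test corpus rather than per sentence, so this sketch only shows the mechanics.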
Similar resources
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Integrating Source-Language Context into Log-Linear Models of Statistical Machine Translation
The translation features typically used in state-of-the-art statistical machine translation (SMT) model dependencies between the source and target phrases, but not among the phrases in the source language themselves. A swathe of research has demonstrated that integrating source context modelling directly into log-linear phrase-based SMT (PB-SMT) and hierarchical PB-SMT (HPB-SMT), and can positiv...
A new model for persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. Under Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...
A Gold Standard for English-Swedish Word Alignment
Word alignment gold standards are an important resource for developing and evaluating word alignment methods. In this paper we present a free English–Swedish word alignment gold standard consisting of texts from Europarl with manually verified word alignments. The gold standard contains two sets of word aligned sentences, a test set for the purpose of evaluation and a training set that can be u...
Evaluating the Quality of Web-Mined Bilingual Sentence Pairs
We come up with the problem of evaluating the quality of bilingual sentence pairs mined from the web, which is critical for a wide range of applications such as statistical machine translation (SMT) and English as Second Language (ESL) learning. To address this problem, we propose a novel method that integrates multiple linguistic features related to spelling, grammar, alignment, and particular...
Publication date: 2004